Consequences of the discontinuation of the International Protein Index (IPI) database and its substitution by the UniProtKB “complete proteome” sets

Authors

  • Johannes Griss
  • María Martín
  • Claire O'Donovan
  • Rolf Apweiler
  • Henning Hermjakob
  • Juan Antonio Vizcaíno
Abstract

The International Protein Index (IPI) database has been one of the most widely used protein databases in MS proteomics approaches. Recently, the closure of IPI in September 2011 was announced. Its recommended replacement is the new UniProt Knowledgebase (UniProtKB) "complete proteome" sets, launched in May 2011. Here, we analyze the consequences of IPI's discontinuation for human and mouse data, and the effect of its substitution with UniProtKB on two levels: (i) data already produced and (ii) newly performed experiments. To estimate the effect on existing data, we investigated how well IPI identifiers map to UniProtKB accessions. We found that 21% of human and 10% of mouse identifiers do not map to UniProtKB and would thus be "lost." To investigate the impact on new experiments, we compared the theoretical search space (i.e. the tryptic peptides) of both resources and found that it is decreased by 14.0% for human and 8.9% for mouse data through IPI's closure. An analysis of the experimental evidence for these "lost" peptides showed that the vast majority have not been identified in experiments available in the major proteomics repositories. It thus seems likely that the search space provided by UniProtKB is of higher quality than the one currently provided by IPI.
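The search-space comparison described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' actual pipeline: the cleavage rule (cut after K or R unless the next residue is P, no missed cleavages), the minimum peptide length of 6, the helper names `tryptic_peptides`, `search_space` and `percent_lost`, and the toy sequences and identifiers, which are made up and not real IPI or UniProtKB entries. The "percent lost" measure used here (peptides present in the old set but absent from the new one, relative to the old set) is one plausible reading of the reported decrease.

```python
def tryptic_peptides(sequence, min_length=6):
    """Return the set of fully tryptic peptides of one protein sequence.

    Assumed rule: cleave after K or R unless the next residue is P,
    with no missed cleavages. Peptides shorter than min_length are dropped.
    """
    peptides = set()
    start = 0
    for i, residue in enumerate(sequence):
        at_end = i == len(sequence) - 1
        cleaves = residue in "KR" and not at_end and sequence[i + 1] != "P"
        if cleaves or at_end:
            peptide = sequence[start:i + 1]
            if len(peptide) >= min_length:
                peptides.add(peptide)
            start = i + 1
    return peptides


def search_space(proteins, min_length=6):
    """Union of the tryptic peptides of all sequences in a {id: sequence} dict."""
    space = set()
    for sequence in proteins.values():
        space |= tryptic_peptides(sequence, min_length)
    return space


def percent_lost(old_space, new_space):
    """Percentage of peptides in old_space that are absent from new_space."""
    if not old_space:
        return 0.0
    return 100.0 * len(old_space - new_space) / len(old_space)


# Hypothetical toy databases; identifiers and sequences are invented.
old_db = {
    "OLD_A": "MKTAYIAKQRPEPTIDEKAAAGGGR",
    "OLD_B": "LLLLLLKDDDDDDR",
}
new_db = {
    "NEW_A": "MKTAYIAKQRPEPTIDEKAAAGGGR",
}

old_space = search_space(old_db)
new_space = search_space(new_db)
print(f"peptides lost: {percent_lost(old_space, new_space):.1f}%")
```

Working with sets of peptide strings means a peptide shared by several proteins is counted once, which matches the idea of a search space of distinct peptides. A real analysis would also need to decide how to treat missed cleavages, isoforms, and isobaric residues such as I and L.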

Similar resources

I-3: Human Y Chromosome Proteome Project 2012 Update

The Human Genome Project has generated a blueprint for the approximately 20,300 gene-encoded proteins potentially active in any of 230 cell types that make up the human body (human proteome). However, based on the UniProtKB/Swiss-Prot database content, about 6000 of at the protein level; for many others, there is very little information related to protein function, abundance, subcellular locali...

The UniProtKB guide to the human proteome

Advances in high-throughput and advanced technologies allow researchers to routinely perform whole genome and proteome analysis. For this purpose, they need high-quality resources providing comprehensive gene and protein sets for their organisms of interest. Using the example of the human proteome, we will describe the content of a complete proteome in the UniProt Knowledgebase (UniProtKB). We ...

The International Protein Index: an integrated database for proteomics experiments.

Despite the complete determination of the genome sequence of several higher eukaryotes, their proteomes remain relatively poorly defined. Information about proteins identified by different experimental and computational methods is stored in different databases, meaning that no single resource offers full coverage of known and predicted proteins. IPI (the International Protein Index) has been de...

Analysis of the tryptic search space in UniProt databases

In this article, we provide a comprehensive study of the content of the Universal Protein Resource (UniProt) protein data sets for human and mouse. The tryptic search spaces of the UniProtKB (UniProt knowledgebase) complete proteome sets were compared with other data sets from UniProtKB and with the corresponding International Protein Index, reference sequence, Ensembl, and UniRef100 (where Uni...

I-49: Human Y Chromosome Proteome Project

The success of the Human Genome Project (HGP) has provided a blueprint for the approximately 20,000 gene-encoded proteins potentially active in all of the hundreds of cell types that make up the human body. Yet we still have limited knowledge about a majority of the gene-encoded proteins which are the “building blocks of life” and “cellular machinery”. It is estimated that for nearly half of th...

Volume 11

Publication date: 2011